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ABSTRACT. In this paper modified variants of the sparse Fourier transform algorithms from ['14 ] are presented which im- 
prove on the approximation error bounds of the original algorithms. In addition, simple methods for extending the improved 
sparse Fourier transforms to higher dimensional settings are developed. As a consequence, approximate Fourier transforms 
are obtained which will identify a near-optimal Ic-term Fourier series for any given input function, / : [0, 2n] D — > (D, in 
O (k 2 ■ D 4 ) time (neglecting logarithmic factors). Faster randomized Fourier algorithm variants with runtime complexities 
that scale linearly in the sparsity parameter k are also presented. 

1. Introduction 

This paper develops fast methods for finding near-optimal nonlinear approximations to the Fourier transform of a 
given function / : [0, 2n] D — » <D. Suppose that f is a bandlimited function so that f e <C N , where N D is large. An 
optimal fc-term trigonometric approximation to / is given by 

(1) f(x>XM)^" f 

where co\, . . . , co^d e [1 - N/2, N/2] D n WP are ordered by the magnitudes of their Fourier coefficients so that 

> |/(w 2 )| > • • • > \f((3 ND )\. 

The optimal A:-term approximation error is then \\f - /^ pt ||2 = II/ — jj^'lb- Suppose k e N is given. The goal of this 
paper is to develop Fourier approximation schemes that are guaranteed to always return a near-optimal trigonometric 
polynomial,^ : [0, 2n] D — > <C N ° , having ||/— i/jtlta ~ ll/-/^ pt ||2- Furthermore, we require that the developed schemes 
are fast, with runtime complexities that scale polylogarithmically in N D and at most quadratically in k. Such Fourier 
algorithms will then be able to accurately approximate the Fourier transform of an arbitrarily given function (i.e., 
with no a priori assumptions regarding "smoothness") much more quickly than a standard Fast Fourier Transform 
(FFT) methods [8, 3| whenever N D >> k is large. More specifically, the developed schemes will lead to Fourier 
approximation algorithms with runtime complexities that scale polynomially in D, as opposed to exponentially. 

The Fourier approximation techniques developed in this paper are improvements of the techniques introduced in 
fl4l . As an example, suppose for simplicity that / : [0, 27t] — > <D is a bandlimited function of only one variable so 
that / e <C N . Furthermore, let k < N be given. The main theorem in [ 14] implicitly proves that 0(k 2 log 4 N) function 
evaluations and runtime are sufficient to produce a sparse approximation, j/jt, to / satisfying 

where /° pt is defined as in Equation Q] This error bound is unsatisfying for several reasons. Principally, if many 
of the Fourier coefficients of / are roughly the same magnitude the approximation error above can actually increase 
with k, the number of nonzero terms in the sparse approximation y^. If nothing else, we would like to improve 
these error guarantees so that additional computational effort can always be counted on to yield better sparse Fourier 
approximations. 

Let p, cj 6 [1, oo). We will say that y 6 C N satisfies an V, l^/k 1 ^' 1 ^ error bound with respect to / e € N if 
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More generally, we will refer to any error bound of the form given in Equation [2] as an instance optimal error bound 
for /. In this paper the result discussed in the previous paragraph is improved by showing that 0(k 2 log N) function 
samples and runtime are sufficient to produce a sparse approximation satisfying an I 2 , 1 1 / yfk error bound with respect 
to the Fourier transform of any N-bandlimited function / : [0, 2n\ — » <D. This decreases the " Vfc ||/ - /^'[jj" 
term in the previous error bound [14] by a multiplicative factor of k. Furthermore, faster randomized methods are 
also presented which are capable of achieving the same type of approximation errors typically achieved by slower 
algorithms based on the restricted isometry property f6] QT) with high probability, despite utilizing a similar number 
of function samples. 

1.1. Results and Related Work. Over the past few years, results concerning matrices with the Restricted Isometry 
Property (RIP) have allowed methods to be developed which can accurately approximate the Fourier transform of 
a function despite being given access to only a very small number of samples. Informally, an m X N matrix M 
has the RIP of order k e N if it acts as a near isometry for all vectors, x e <C N , which contain at most k nonzero 
entries. Particularly important for our purposes is that RIP matrices of order 2k serve as good measurement matrices 
for sparsely approximating vectors in <C N . Suppose M is an m X N matrix with the RIP of order 2k. Then, for 
any x e <C N , a variety of computational methods including Z 1 -minimization 131 [5] |6), Orthogonal Matching Pursuit 
11271 IPTl . Regularized Orthogonal Matching Pursuit [20 21], Iterative Hard Thresholding (2), etc., will take Mx as 
input and subsequently output another vector, y e <D N , satisfying an instance optimal error bound with respect to x 
(e.g., an Z 2 ,/ 1 / Vfc error bound). Hence, any linear operator satisfying an appropriate RIP condition can serve as an 
efficient measurement operator capable of capturing sufficient information about any input vector in order to allow it 
to be accurately approximated. 

The most pertinent RIP result to approximate Fourier recovery as considered here states that a rectangular matrix 
constructed by randomly selecting a small set of rows from anNxN inverse discrete Fourier transform matrix will 
have the RIP with high probability. The following theorem was proven in [26] and subsequently generalized and 
improved in [24] . 



Theorem 1. (See H26II )■ Suppose we select m rows uniformly at random from the rescaled N X N Inverse Discrete 
Fourier Transform (IDFT) matrix -^W -1 , where 



e n 



and form the mxN submatrix At If m is Q (k ■ logN ■ log 2 k ■ log(ZclogN)) then M will have the RIP of order k 
with high probability. 



Let W be the N X N Discrete Fourier Transform (DFT) matrix defined by Wij = ^ • , / : [0,2n] -> C be 

a given function, and / e <D N be the vector of N equally spaced samples from / on [0, 2n]. In this case TheoremQ] 

-* 

tells us that collecting the m function samples determined by AW/ will be sufficient to accurately approximate the 

-> p» -* 

discrete Fourier transform of / with high probability. More precisely, if Mf = Av¥f is input to a recovery algorithm 

known as CoSaMP [ 19] the following theorem holds. 

Theorem 2. (See 0191 ). Suppose that M is a mxN measurement matrix fanned by selecting m = &(k ■ log 4 N) rows 
from the N X N IDFT matrix, ^F -1 , uniformly at random. Furthermore, assume that M satisfies the RIP of order 21^ 
Fix precision parameter r\ G 1R and let II = AV¥f be measurements collected for any given f £ (D . Then, when 
executed with U as input, CoSaMP will output a 2k-sparse vector, y € (D N , satisfying 



f-y 



< Const • max < tj, — — 

Vfc 
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where f k ° pt is a best possible k-term approximation for f = mf. The required runtime is O log N ■ log ^ / 



Note that this is true with high probability by TheoremfT] 
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Table 1 . Sparse Fourier Approximation Algorithms with Robust Recovery Guarantees 



In effect, Theorem [2] promises that CoSaMP will locate 2k of the dominant entries in / if given access to ®(k ■ 

log N) samples from /. If / contains 2k significant frequencies whose Fourier coefficients collectively dominate 
all others combined, then these most significant frequencies will be found and their Fourier coefficients will be well 

approximated. If / has no dominant set of 2k entries then CoSaMP will return a sparse representation which is 
guaranteed only to be trivially bounded. However, in such cases sparse Fourier approximation is a generally hopeless 
task anyways and a bounded, albeit poor, sparse representation is the best one can expect. In any case, as long as the 
random function samples correspond to a matrix with the RIP, CoSaMP will output a vector satisfying an instance 

optimal error bound with respect to /. However, the required runtime will always be Q(N). More generally, all 
existing Fourier recovery methods based on RIP conditions have superlinear runtime complexity in N. 

Other existing Fourier algorithms for approximating / e (D N given sampling access to / e (D N work by utilizing 
random sampling techniques [ iT2l[T3l . These approaches simultaneously obtain both instance optimal error guarantees, 
and runtime complexities that scale sublinearly in N. However, they generally also require more function samples than 
recovery algorithms which utilize matrices satisfying the RIP. A variant of the following Fourier sampling theorem, 

concerning the sparse approximation of / provided sampling access to / : [0, 2n] — > (D, is proven in ifTJl . 

Theorem 3. (See II 131 ). Fix precision parameters i[,x 6 1R + and probability parameter A e (0, 1). There exists a 
randomized sampling algorithm which, when given sampling access to an input signal f € (D , outputs a k-sparse 
representation y for f satisfying 



f-y 
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+ x ■ max < ?] 
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with probability at least 1 — A. Here f, opt is a best possible k-sparse representation for f. Both the runtime and 



sampling complexities are bounded above by 

■ (log Jog (^bog I W| 2 JogN, 1 



0(1) 



It is important to note that the probabilistic guarantee of recovering an accurate sparse representation provided by 
Theorem[3]is a nonuniform per signal guarantee. In contrast, Fourier approximation procedures which rely on RIP ma- 
trices provide uniform probability guarantees/or all possible input vectors. If a set of sample positions corresponds to 
an N X N IDFT submatrix with the RIP property, those sample positions will allow the accurate Fourier approximation 
of all possible input vectors / e <D N . 

In this paper several Fourier algorithms are developed which obtain instance optimal approximation guarantees 
while also improving on various aspects of the previously mentioned approaches. See Table [T] for a comparison 
of the results obtained herein with Theorems |2] and [3] when applied to the problem of approximating the Fourier 
transform, / e (D N , of an N-bandwidth function / : [0, 2n] — » C The first column of TableQ]lists the Fourier results 
considered, while the second column lists whether the recovery algorithm in question guarantees an instance optimal 
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output Deterministically (D), or With High Probability (w.h.p.) per signal. Note that CoSaMF0has an "« D" listed in 
its second column. This denotes that the RIP results utilized in Theorem [2] provide a uniform probability guarantee, 
although no explicit constructions of RIP matrices satisfying these bounds are currently known. The third and fourth 
columns of Table Q] contain the sampling and runtime complexities of the algorithms, respectively. For simplicity 
some of the bounds were simplified by ignoring precision parameters, etcQ Finally, the fifth column of Table [T]lists 
the instance optimal approximation guarantees achievable by each algorithm when budgeted the number of samples 
and time listed in the third and fourth columns. The "+rf' in the CoSaMP and Sparse Fourier rows remind us that their 
error bounds are good up to an additive precision parameter. 

The last row of Table[T]lists lower bounds for the runtime and sampling complexity of any algorithm guaranteed to 
achieve an instance optimal I 2 , 1 1 / Vk Fourier approximation error (see J7)). Note that all six approaches have sampling 
complexities containing additional multiplicative logarithmic factors of N beyond the stated lower sampling boundQ 
The lowest overall sampling complexity is achieved by Corollary [5] although, it is achieved at the expense of a weak 
nonuniform "w.h.p." approximation probability guarantee. Similarly, Corollary |4]improves on the previous sampling 
complexity of the sparse Fourier algorithm in lfl3ll while at least matching its runtime complexitjQ. Finally, to the 
best of the author's knowledge, Theorem [7] obtains the best available runtime of any existing deterministic Fourier 
approximation algorithm which is guaranteed to achieve an instance optimal error guarantee. 

The remainder of this paper is organized as follows: In Section [2] the notation utilized throughout the remainder 
of the paper is established. Next, in Section [3] a number theoretic matrix construction is presented and analyzed. 
Section lXTl explains how random submatrices of the presented number theoretic matrices can yield nonuniform prob- 
abilistic approximation guarantees, while Section 13.21 outlines a useful relationship between these matrices and the 
Fourier transform of a periodic function. In Section [4] the matrices defined in Section [3] are used to construct Fourier 
approximation algorithms with runtime complexities that scale superlinearly in N (i.e., Theorem [6] and Corollary [3] 
are proven). Next, in Section [5] the algorithms of Section @] are modified into algorithms with runtime complexities 
that scale sublinearly in N (i.e., Theorem|7]and Corollary [4] are proven). In Section|6]a simple strategy is given for 
extending the results of the previous two sections to higher dimensional Fourier transforms. Finally, a short conclusion 
is presented in Section|7] 

2. Notation and Setup 

Below we will consider any function whose domain, I, is both ordered and countable to be a vector. Let x : I — » <D. 
In this case we will say that i€ <D' 7 ' , and that x, = x(i) e <D for all i e I. We will denote the V norm of any such vector, 
x, by 



11*11, = £w 

V iel 



forp e [1, oo). 



If x is an infinite vector (i.e., if I is countably infinite), we will say that x e V if ||x*|| p is finite. Without loss of generality, 
we will assume that a given x e <D N is indexed by I = [0, N) PI % unless indicated otherwise. The vector e <C N 
will always denote the vector of N ones, and Ojv e <D N with always denote the vector of N zeros. 

For any given x e <D |Z| and subset S c I, we will let Xs e (D' J ' be equal to ion S and be zero everywhere else. Thus, 



(*s), 



Xj if i 6 S, 
otherwise 



Furthermore, for a given integer k < \I\, we will let S° pt C / be the first k element subset of I in lexicographical order 
with the property that |x s | > \xt\ for all s e S° pt and t e I — S° pt . Thus, S° pt contains the indexes of k of the largest 
magnitude entries in x. Finally, we will define £? pt to be x s o P t, a best fc-term approximation to x. 



•^We used CoSaMP as a representative for all RIP based recovery algorithms because, for the purposes of Table[T]at least, it matches the currently 
best achievable runtime, sampling, and error bound performance characteristics of all the other previously mentioned RIP-based methods in the 
Fourier setting. 

3 The 0(N ■ log N) runtime listed for Corollary [5] will hold if k is 0(N/ log 2 N). More generally, the runtime will always be 0(N • log 3 N). 
"*Both the sampling and runtime complexities of the Sparse Fourier algorithm presented in [131 scale like d(k ■ log 5 N). 

5 It must be remembered, however, that the algorithm presented in | ; 13] enjoys a stronger approximation error guarantee up to its addi- 
tive/multiplicative precision parameters. 



In this paper we will be considering methods for approximating the Fourier series of an arbitrarily given periodic 
function, f : [0, 2n] D — » (D. Following convention, we will denote the Fourier transform of f by / : TP — » (D, where 



(2tz) Jfe[0,2n] D 



<B- i<3 ' f /(f) dx for all we Z r 



Note that / can be considered an infinite vector indexed by HP . We also have the inverse relationship 

/(£) = f(co) e i£ * f for all xe [0,27z] D 

Thus, we learn / in the process of approximating its Fourier transform. 

Call each co e TL D a Fourier mode ox frequency, and / (co) its corresponding Fourier coefficient. Ultimately, we will 
restrict our attention to the Fourier modes of f inside some finite bandwidth. We will do this by identifying, and then 

estimating the Fourier coefficients of, the most energetic Fourier modes in (— , |^ Jj D iP for a given bandwidth 
value NeN. Toward this end, define the vector / e <C N by 

fa = /(<3) forall^ef- - ,[- 

Similarly, define f : WP — » (D to be the Fourier transform of the related optimal bandlimited approximation to f. More 
precisely, let 

ifd?e(-[fl,[fj] D n;£ D , 
otherwise 



for all co e K D . We will approximate f by approximating /. However, in order to do so we must first construct a 
special class of matrices. 



3. A Specialized Measurement Matrix Construction 

We consider m X N measurement matrices, M S i,k, constructed as follows. Select K pairwise relatively prime 
integers beginning with a given s\ e N and denote them by 



(3) 



Sl < • • • < Sjc. 



Produce a row ry,, where j e [1, K\ PI N and h e [0, S;) n N, in M S1i k for each possible residue of each s; integer. The 
n th entry of each ry, row, m e [0, N) n N, is given by 

(rjjdn = 6((n-/i)mods ; J 

1 \fn = h mod sy 
otherwise 



(4) 

We then set 

(5) 



ri,o 
^ rK,s K -i ) 

The result is an (m = YJj=\ s j) x N matrix with binary entries. See FigureQ]for an example measurement matrix. 

The matrices constructed above using relatively prime integers have many useful properties. As we shall see later 
in Section H] these properties cumulatively allow the accurate recovery of Fourier sparse signals. We require two 
additional definitions before we may continue. Let n e [0, N) Pi N. We define Ai Sl ,K,n to be the KxN matrix created 
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n e [0, N) n N 





n = mod 2 


r i 


n = 1 mod 2 





n = mod 3 


i 


n = 1 mod 3 





n = 2 mod 3 





n = 1 mod 5 






1 2 3 4 5 6 

10 10 1 
10 10 10 

1 1 

10 10 

10 10 

1 1 



Figure 1 . An Example Matrix created using sj = 2, s 2 = 3, S3 = 5, . . . 



by selecting the K rows of M Si ,k with nonzero entries in the 77 th column. Furthermore, we define M' Sl ,K,n t0 be the 
K X (N - 1) matrix created by deleting the n th column of M Sl ,K,n- Thus, we have 



(6) 



and 



M SuK ,n 



Y\ t n mod Si 
^2, n mod Si 



\ ?K, n mod Sk ) 



(7) Af Sl ,*,« 



( (ri n mod Si n mod si )i ... (n n mod Si )n-l fa h mod si)h+1 

fa ,n mod S2 )o fa ,n mod s 2 )i ... (r 2 ,n mod S2 )n-l fa ,« mod s 2 )rc+l 



,« mod si )n-1 
fa ,« mod S2 )n-i 



V fac « mod s K )o fac, ll mod s K h ■•• fa, « mod sk )n-l ('Xn mod s K )n+l ■ • ■ (7"K,« mod )n-1 J 
We have the following two lemmas. 

Lemma 1. Let n,k e [0,N) n N ant/ x e (D N_1 . 77zen, af most k [log Si Nj q/7/ze K entries ofM' Sl ,K,n • x will have 
magnitude greater than or equal to ||x||i/£ 

Proof: 



We have that 



< 7^- \\M' Sl ,K,n ■ 4 ^ ' k ■ HM' Sl ,K,„||l 



by the Markov Inequality. Focusing now on M' Slj K,n we can see that 



(8) \\M' Si ,k ,n\\ ,= max Y M M ' si,K ,n) ,; = max Y 5 ((n - I) mod s,) < I log N 

11 111 !s[0,N-l)nN 4— ' I ; ' /e[0,N-l)nN 1 v y L 1 • 

7=1 7=1 

by the Chinese Remainder Theorem (see 11221 ). The result follows. □ 



Lemma 2. Let n,ke [0, N) n N, S c [0, N) n N vwY/z |S| < k, and x e C N_1 . Then, M' Sl ,K,n ' x and M' S1i k,„ ■ (x - x s ) 
will differ in at most k |log Sl A/j of their K entries. 

Proof: 
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We have that 

{; (M' Sl ,K,n-^j*(M' Sl ,K,n-(x-x s ))^ = {/ (M Sj , Kn ■ x s )j * o} < {; (M' Sl ,K,n-(i N -i) s ).>l} 

since all the entries of M' Sl ,K,n are nonnegative integers. Applying Lemma[T]with x - (1n-i) s and k = |(ljv-i) s | = 
|S| finishes the proof. □ 

Combining these two Lemmas we obtain a general theorem concerning the accuracy with which we can approxi- 
mate any entry of an arbitrary complex vector x e <D N using only entries of M S1j k • x. 

Theorem 4. Let n, k, si e [0, N) n IN, e' 1 e N + , c e [2, oo) n N, and x e € N . Set K = c ■ (k/e) I log Si n| + 1. men, 

more fncm — • X of the K entries of M Sl ,K,n • * wi// estimate x n to within . 8 precision. 

Proof: 

Define y 6 (D N_1 to be y = (xq, Xl, . . . , X n -i, X n+ i, ... , 3Cn-i). We have that 

M Sl , K ,n -X = X n -1 K + M' Sl ,K,n ■ y- 

Applying Lemma [2] with k = (k/e) reveals that at most (k/e) [log Si A/j entries of M' Sll K,n ' V differ from M' Sx ,K,n ' 
(y - tffije))' ^ remamm 8 ^ ~~ (^/ e ) [log Si A/J entries of -M' Si ,k,b • 3A at most (fc/e) [log Si Nj will have magnitudes 
y ~~ V(k/e) I Lemma[T] Hence, at least 

K - 2(k/e) [log Si Nj > (c - 2)(fc/e) [log Si n\ + 1 > — ■ K 



greater than or equal to e 



entries of M' Sl ,K,n • y will have a magnitude no greater than 



€ ■ 



-> -opt 



-> -opt 

x ~~ x (;c/<?) 



The result follows. □ 

We will now study the number of rows, m = Yjj=\ s /' m our measurement matrix under the Theorem|4] assumption 
that K = c ■ (k/e) |log Si A/j + 1 for some constant integer c e [2, oo) and given values of S\ = (k/e),N € N + . Given 
this assumption concerning K, we wish to bound the smallest possible sum, m, resulting from all possible choices of 
pairwise relatively prime s; values. We will do this by bounding m for one particular set of S; values. 

Let pi be the Z th prime natural number. Thus, we have 

(9) V\ =2,p 2 = 3,p 3 =5,p 4 = 7 / ... 
Next, define ijelNso that 

(10) p q -i < (k/e) < Pr 

We will use the first K primes no smaller than (k/e) to define our relatively prime s; values for the purposes of bounding 
m. Hence, for the remainder of Section[3]we will have 



(11) 



Si = - < p q < S 2 = Pq+\ <■■■ <S K = p q+K -l- 



It follows from results in [ 15 1 that 

K K-l 

(12) rn = Yj S i-YuPi + i 



2h\p q+K 



l + O 



;=1 ;=0 

Furthermore, the Prime Number Theorem (see ll22l ) tells us that 



\np q+K )) 2\np q 



l + O 



hp, 



and 



*/, rt /lnln(ifc/e) 



Thus, if we use K = c ■ (k/e) ^log^j A/j + 1 in order to construct M(kj e ),K we will have 



c-k\ log tt , , N 
K= L 6(t/ e) £ | 1 + 

e V Vln 



Here we have assumed that (k/e) + K is less than AT. Applying the Prime Number Theorem once more we have that 

(13) p q+ K 
Utilizing Equation[T2]now yields 



c ■ k [log We) N\ ■ In (^n) C fin rn(^N) S 



K-l 



(14) 



7=0 



.fc 2 [log ( , /e) Nj 2 .ln(^) 
2e 2 



l + O 



(infamy 



, ln(^) / 



Hence, we have an asymptotic upper bound for the number of rows in Ai(k/ e vK- The next theorem, proven in Appen- 
dix|A] provides a concrete upper bound. 

Theorem 5. Suppose that N,k,e~ l 6 N - { 1} with N > k > 2. Then, if we set K = c ■ (k/e) I log Si Nj + 1 for some 
constant integer c G [2, oo), there exists an mxN measurement matrix, M Si ,k, with a number of rows 



m < 



3(c + 1.89) 2 ■ fc 2 [log {k/e) N\ 2 J(c + 1.89) • k [log (jc/e) n\ ) 



i-e 2 



Tighter upper bounds for the number of rows may be explicitly calculated using Equations]. 
Proof: See Appendix lAl □ 



-UEbelow. 



Theorems |4] and |5] collectively provide bounds for the number of rows a measurement matrix AI Si ,k may contain 



-> nopt 
X ~ X (k/e) 



These 



and still be able to estimate any entry of a vector x e (D N to within a precision proportional to 
bounds are universal in that they pertain to measurement matrices which are guaranteed to provide accurate estimates 
for all entries of all vectors x e <D N . In the next section we will prove the existence of a small number of A\ Si ,k rows 
which are guaranteed to provide precise estimates for any sufficiently small set of vector entries. We will also briefly 
consider a randomized matrix construction based on uniformly sampling rows of the deterministic M S1i k matrices 
considered above. These results will ultimately motivate the development of sparse Fourier transforms with reduced 
sampling requirements. 

3.1. Randomized Row Sampling and Existence Results. In this section we will consider submatrices of the mxN 
measurement matrices, Ai Slr K, discussed above. More specifically, we will be discussing matrices formed by selecting 
a small number of rows from an M Si ,k matrix as follows. Let S = \Sj ir Sj 2 , . . . , sy, J be a subset of the Sj values used to 
form M Si ,k ( see Equations|3]-|5]l. We will then define Ai$ to be the (rh = Lg =1 s /,,) x N matrix, 

( r h,o 



(15) 



Mc 



'n*h- 



V r M,-i ) 
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with each row defined as per Equation|4] Finally, for n e [0, N) Pi N, we define Alg „ to be the I X N matrix, 



(16) 



At 



rj^n mod Sjj 
mod s ; 7 



v ^" ji,tl mod s y ; 

along the lines of Equation|6] The following corollary of Theorem|4]demonstrates the existence of small submatrices 
of M Si ,k capable of providing accurate approximations to any given subset of a given vector x e (C N . 

Corollary 1. Let k,N,e~ l 6 N, S c [0,N) n N, ant/ x e € N . Set K = c ■ (k/e) llog Si Nj + 1/or Si 6 N and a 
constant integer c € [4, oo). Form an mx N measurement matrix M S1/ k as P er Section\3\ Then, there exists a subset 
ofO (log |S|) Sj values for M SuK , 

5 = { S/l ' S/2 '---' S W 2 ,M + Dl}' 

w;?/z the following property: For all n E S we have 



(M S n 5t-x n ■ 



-> -opt 

Y — T 

(Jfc/e) 



Proof: 

We proceed by induction on the size of S c [0, N) n N. For the base case we assume |S| = 1 and apply Theorem|4] 
with n set to the single element of S. We then define S to be a singleton set containing any one of the Sj rows of 
M Sl ,K,n which approximates x„ to the guaranteed precision. Now, suppose that the statement of Corollary Q] holds for 
all subsets S c [0, N) D N with |S| < a E N + . Let S' c [0,N) n N have |S'| < We will prove that the statement of 
CorollaryUholds for S'. 

For each n e S' and j e [1, K]nNwe will count a 'failure' if 



e • 



f° pt 

(k/e) 



(M sl ,K,nX)j - x n 

Theorem |4] tells us that there will be fewer than (2/c) ■ K 'failures' for each element of S', for a total of fewer than 

2 IS'I 

-^r- 1 • K collective 'failures' for all elements of S . Clearly, at least one of the K Sj values used to construct M Si ,k must 

2 IS'I 

'fail' for fewer than ■ elements of S . Let s . be the s, value which 'fails' for the smallest number of elements of S , 
and let S" C S' contain all the elements of S' for which s'. 'fails'. We can see that \S"\ < < a. Our induction 
hypothesis applied to S" together with the presence of s'j yields the desired result. □ 

Corollary Q] demonstrates the existence of a small number of Sj values which allow us to estimate every entry 
of a given vector. However, it is apparently difficult to locate these Sj values efficiently. The following corollary 
circumvents this difficulty by showing that a small set of randomly selected Sj values will still allow us to estimate all 
entries of any given vector with high probability. Thus, in practice it suffices to select a random subset of the rows 
from a M Si ,k matrix. 

Corollary 2. Let k, N, e~ l e N + , a e [2/3, 1), S c [0,N) n N, and x e C N . Set K = c ■ (k/e) llog Si n\ + 1 for 
Si e N and a constant integer c € [14, oo). Form an mxN measurement matrix M Si ,k as per Section\3\ Finally, form 
a multiset of the Sj values for At Sl K by independently choosing 



(17) 



21 In 



Si values uniformly at random with replacement. Denote this multiset of S; values by 



Then, with probability at least a the resulting random matrix, Ai$, will have the following property: For all n £ S 
more than 1/2 of the S; h € S (counted with multiplicity) will have 



-> -opt 

T — T 

(k/e) 



Proof: See Appendix IB1 □ 



Notice that Corollary [2] considers selecting a multiset of rows from a M Si ,k measurement matrix. In other words, 
some rows of the measurement matrix may be selected multiple times. If this occurs in practice, one should consider 
any multiply selected rows to be chosen more than once for counting purposes only. For example, during matrix 
multiplication a multiply selected row should be processed only once in order to avoid duplication of labor. However, 
the results of these calculations should be considered multiple times for the purposes of estimation (e.g., in the median 
operations of Algorithmic. 

We will now consider these mxN matrices, M Si ,k, with respect to the discrete Fourier transform. In particular, we 
will consider using M Si ,k to estimate the Fourier transform of a periodic function along the lines of Theorem[4] As we 
shall see, the special number theoretic nature of our matrix constructions will allow us to estimate Fourier coefficients 
of any periodic function by using a small number of function samples. 

3.2. The Fourier Case. Suppose / : [0, 2n] —> (D is a complex valued function with / e Z . Let P be the least 
common multiple of |N, Si, . . . , s^} and form a set of samples from /, A £ (D p , with 

2n' 



A P = /(f7) for P e [°' P ) nE 



Ultimately, we want to use AL^x/ in order to estimate the entries of the N-length vector /. However, we must first 

-* -* 

calculate AI Si ,k/- In the remainder of this section we will discuss how to calculate M Sl ,Kf <= C" while using as few 
samples from / as possible in the process. 

To solve this problem we will use an extended version of our mxN matrix AI Si ,k- This extended matrix, G Sl ,K, is 
the mxP matrix formed by extending each row r;j, of M Si ,k as per Equation|4]for all p e [0, P). We now consider the 

product of &si,k a nd the P X P discrete Fourier transform matrix, ^V, defined by = p ' e ■ For each row r 
of £ Si ,k and column p of W we have 



(18) 



V ' r i,h<V P 

1=0 



-2ni-p-h Sj 



1=0 







if p = mod 
otherwise 



Thus, G Si ,k • *r is highly sparse. In fact, we can see that each ry, row contains only sy nonzero entries. Better still, 
all the rows associated with a given sy have nonzero column entries in a pattern consistent with a small fast Fourier 

transform. This aliasing phenomena results in a fast algorithm for computing & Si ,k • • A (see Algorithm[TJ. Lemma[5] 

shows that B Sli x^¥A is a good approximation to M Slr Kf G < C m for all periodic functions whose Fourier transforms 
decay quickly enough. 



Lemma 3. Every entry of£> Sli t*S l A approximates the associated entry of M Sl xf t0 within f - f 
Proof: 



precision. 



Suppose that N is odd (the case for N even is analogous). Then, for all j e [1, K] D N and h e [0, sy) D N, we have 



that 



M Sl , K f-£ Sl ,K^A} 



Yj fh+l-sj- f(a>) 



l, \h+l-Sj\< c - 



o}=h mod s/ 
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l, \h+l-Sj\< r - 



co=h mod Sj 



Algorithm 1 Fast Multiply 



l: Input: Function /, integers k < K <N, relatively prime s^,.. .,s K 

2: Output: £ Sl<K • W ■ A 

3: for j from 1 to K do 

5: ,4.' «- FFT [,C| 

6: end for 

7: Output \A Sl , A S2 , A Sk ) 



Cancelling all Fourier coefficients for frequencies in (—\ n N we get that 



(19) 



^ L \fM\ = \\f-f 



. □ 



) = E a«> 

''''' a>=hmodsp N>^ Ma^y 1 

By inspecting Equation [L8l it is not difficult to see that Algorithm [T] utilizes exactly m — (K — 1) samples from /. 
Considering this in combination with Theorem|5]in Section [3] leads us to the conclusion that AlgorithmQ] samples / 

at O I — ■ — J 1 — distinct values. Similarly, we can see that Algorithm[lJruns in time O \hj=i s j l°g s ; ) if we 

calculate the FFTs using a chirp z-transform ll23l . Thus, for well chosen Sj values the runtime will be 

(K-\ \ 





( K > 




o 




= o 




U=i J 


V 



(20) 



= O 



f^-Llog^Nj^ln 2 ^)^) 



using Equation[T3] We will now demonstrate how the specialized m xN matrices, M Si ,k, along with their extended 
mxP counterpart matrices, S Su k, considered throughout Sections[3land l3.2l can be utilized to construct accurate sparse 
Fourier transform methods. 



4. Fourier Reconstruction 



In this section we develop a sparse Fourier transform based on the measurement matrices considered in the previous 
section. This sparse Fourier method is entirely dependent on the ability of our developed measurement matrices to 
accurately estimate any entry of a vector with which they have been multiplied (i.e., Theorem|4|i. The idea behind the 
algorithm is simple. We first quickly approximate the product of a Section [3] measurement matrix with the Fourier 
transform of an input function using Algorithm Q] We then use the this product to accurately estimate all Fourier 
entries, keeping only the largest magnitude estimates for our final sparse Fourier approximation. See Algorithm [2] for 
pseudo code. Theorem|6]provides error, sampling, and runtime bounds for Algorithmic 



Theorem 6. Suppose f : [0, 2n] 
output an x$ € <D N satisfying 



<Chas f e I 1 . Let N,fc,e _1 eN-(l) with N > (k/e) > 2. Then, Algorithm\2\will 



(21) 



f-x s 



f-fT 



lie- 



f~f< 



opt 

(k/e) 



-Ik 



■22Vfc- 



In the process f will be evaluated at less than 

^ 2 [l°g(^)Nj 2 , (5.89-fc[log (t/e) Nh 



26.02 • 



■In 



Algorithm 2 Fourier Approximate 1 



1: Input: k,N,e 1 e N- {1}, Function/, Measurement matrix M Su k with K = 4- (k/e) |_log Si Nj + 1 (see Section[3]l 

2: Output: Xg, an approximation to f k ° pt 
3: Initialize S <— 0, x <— On 

4: fis^K^A «- AlgorithmQl/, fc, K, N, Sj values for M Si ,k) 
5: for co from 1 - [f j to [f J do 

6: Re {x„} <- median of multiset jlRe j(s si/K< ^A).J 1 1 < j < Kj 

7: Im {xj <- median of multiset |lm |(S s1 jk, (U ^E''A).| 1 1 < j < k| 

8: end for 

9: Sort x entries by magnitude so that \x ail \ > \x^ 2 \ > \Xm 3 \ > ■ ■ ■ 

10: S <- {coi,co z ,...,con\ 

11: Output Xs 



points in [0,2rt]. 77ze runtime of lines 5 through 11 is O (N • (k/e) log^ e ) A[). 
Proq/: 

Fix £U e (- If j , If I] n Z and let 5 be set to 



6 = 



f_ /OP 1 

7 ■'(t/e) 



As a consequence of Theorem|4]and Lemma[3]we can see than more than half of the K = 4 ■ (k/e) I log S] ivj + 1 entries 

of Ss^K/uNfM produced in line 4 will satisfy J^j^^A) . - f a 
7 will have 



< 5. Therefore, the x a , value produced by lines 6 and 



(22) \Xa>-fco\ < V5-& 

Since Equation|22]holds for all to e ( _ |~7~| ' |_7 J] n ^ we can begin to bound the approximation error by 
f-x s < f-fs + fs-xs < f-fs +2Vfc-6 

2 2 2 2 

(23) 



/-/i 



opt 



+ E Itf- E Ia.| 2 + 2 ^- 6 - 



a>eS° pt -S 



a>eS-S° pl 



In order to make additional progress on Equation [23] we must first consider the possible magnitudes of / entries at 
indices in S - S° pt and S° pt - S. 

k k 

Suppose co e S^ pt - S and let co e S — S^ pt . Line 10 will only have placed co € S instead of co if |x<jj| > IxJ. 
However, this can only happen if 

\fl k \ + V2-6 > \fl\ + V2-6 > \fl\ - V2-6 > \fl k \ - V5-& 

In other words, all elements of S - S° pt and S° pt - S must index / entries with roughly the same magnitude as the k th 

largest magnitude entry of / (up to a 5 factor). Furthermore, since |S| = 2k we can see that |S - S° pt | > 2 • |S° pt - S|. 
We are now ready to give Equationl23lfurther consideration. 
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If S° pt - S = we are finished. Otherwise, if S° pt - S + 0, we will have 



£ |/;| 2 >2.|Sf-S|.(|/; f |-2V2.6) 2 = A, 



«es-sr 



and 



B = |S° pt - S| ■ + 2 V2 • 5) 2 > Yj \f» 



oeS%*-S 



If A > B then we are again finished. If A < B then 

A |2 



\fl k \ -12V26-I4I + 86 2 <0 

which can only happen if \f Wlt | e ((6 V2 - 8) • 5, (6 V2 + 8) ■ fi). Therefore, in the worse case we can continue to bound 
Equation|23lby 



f-X S 



opt 



+ k ■ (8 V2 + 8) 2 • 5 2 + 2 Vfc • <5 < 



/-/i 



opt 



+ 22 Vfc ■ <5. 



The error bound in Equation l2"T1follows . 

The upper bound on the number of point evaluations of / follows directly from the application of Theorem|5] with 
c = 4. Finding the largest 2k magnitude entries of x in lines 9 and 1 can be accomplished in 0(N ■ log k) time by using 
a binary search tree (see [ 16|). Therefore, the runtime of Algorithmic will be dominated by the median operations in 
lines 6 and 7. Each of these medians can be accomplished in 0(K) time using a median-of-medians algorithm (e.g., 
gl). The stated 0(N ■ K) runtime follows. □ 

Note that the overall runtime behavior of Algorithm [2] will be dictated by both Equation [20] and the runtime stated 
in Theorem[6] However, for most reasonable values of sublinear sparsity (i.e., whenever k/e is 0(N/ log 3 N)) the total 
runtime of Algorithmic will be O (N ■ (k/e) log^^ NJ. One strategy for decreasing the runtime of Algorithm|2]is to 
decrease the number of measurement matrix rows, K, required to accurately estimate each Fourier coefficient. Pursu- 
ing this strategy also has the additional benefit of reducing the number of function evaluations required for approximate 
Fourier reconstruction. However, in exchange for these improvements we will have to sacrifice approximation guar- 
antees for a small probability of outputting a relatively inaccurate answer. 

Following the strategy above we will improve the performance of Algorithm[2]by modifying its input measurement 
matrix. Instead of inputing a AI S1i k measurement matrix as constructed in Section[3]we will utilize a randomly con- 
structed Aig measurement matrix as described in Section [XT1 Corollary [2] ensures that such a randomly constructed 
Ai§ matrix will be likely to have all the properties of M Si ,k matrices that Algorithmic needs. Hence, with high proba- 
bility we will achieve output from Algorithmic with the same approximation error bounds as derived for Theorem[6] 
Formalizing these ideas we obtain the following Corollary proved in Appendixlcl 

Corollary 3. Suppose f : [0,2n] -> C has f E P. Let a 6 [2/3, 1) and N,/c,e _1 e N - {1} with N > (k/e) > 2. 
Algorithm\2\may be executed using a matrix Aig from Section [PI in place of the matrix Ai Sl ,K from Section\3\to 
produce an output vector x$ e (D N which will satisfy Equation]2l}with probability at least a. In the process f will be 
evaluated at less than 



15.89 • 



21 - In 



In 



+ In In 



points in [0, 2n]. The runtime of lines 5 through 11 will be O (N ■ log (]^))- 
Proof: See Appendix|C] □ 

When executed with a random matrix Aig as input the overall runtime complexity of Algorithm|Cwill be determined 
by both the runtime stated in Corollary |3] and the runtime of Algorithm Q] Suppose S is a subset of O^log^^)) Sj 

13 



values defined as per Equations[9]-[H] Then, Algorithm[T]will have a runtime complexity of 



O 



1,-eS 



(24) 



Y^Sj ■ log Sj = o(p q+ K-iogp q+ K-log(j— -\\ (see Equation[T3]i 



))- 



O 



~ lo § 



•log 



N 
l-o 



Thus, Algorithm [2] executed with a random input matrix from Section 13.11 will have a total runtime complexity of 
O (N ■ log (1^7)) whenever (k/e) is 0(N/ log 3 N). If we now set the desired success probability, a, to be 1 - 1 /N°^ 
we obtain an overall 0(N ■ log N) computational complexity for Algorithmic This matches the runtime behavior of a 
standard fast Fourier transform while requiring asymptotically fewer function evaluations. 

In the next section we will discuss methods for further decreasing the runtime requirements of Algorithm [2] while 
maintaining its approximation guarantees (i.e., the error bound in Equationl2"Tl>. As a result we will develop sublinear- 
time Fourier algorithms that have both universal recovery guarantees and uniformly bounded runtime requirements. 

5. Decreasing the Runtime Complexity 

Let 3K, T> be m X N and mxN complex valued matrixes, respectively. Then, their row tensor product, © 23, is 
defined to be the (m ■ m)xN complex valued matrix created by performing component-wise multiplication of all rows 
of M with all rows of S. More specifically, 

(Jl ® S)i j = Mi mod m,j ■ Bi-imndm j. 



In this section we will use the row tensor product of two types of specially constructed measurement matrices in order 
to improve the runtime complexity of Algorithmic One of these matrix types will be the M Si ,k measurement matrices 
developed in Section[3] The other type of matrix is described in the next two paragraphs. 

Suppose that anmxN measurement matrix, M Si ,k, is given. Furthermore, suppose that Si, . . . , s# e IN are such 
that there exist A integers, t\ < ■ ■ ■ < tx < Sj, with 



n 

!=1 



U> - 



that also have the property that the set 



..,t A ,s 1 ,...,s K } 

is pairwise relatively prime. Note that such £; values can indeed be found if all the given Sj values are prime numbers 
and Si > log 2 N • (lnlog 2 N + lnlnlog 2 N) > PLiog 2 Nj forN > 64 (see iflOll '). We will now demonstrate how to use 
such f, values to create an fh X N matrix, Nx^, along the lines of Section[3] 

Create a row, fy,, in N\ iSl for each possible residue of each f, integer (i.e., fy, has i e [1, A] D N and h e [0, f/) n N). 
The 11 th entry of each fy, row, n e [0, N) fl N, will be 

(fi,h)n = d((n- h) mod U) 



(25) 



We then define 



ifn = h mod f; 
otherwise 



In 
f i,o 



(26) 



N A , 5 



r Xh-\ 



The result is an (m = 1 + Y^t=i ^i) x N matrix with binary entries. The following Lemma, proven in AppendixiDl upper 
bounds the smallest possible number of rows in any such A/a, Si matrix. 
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Lemma 4. Suppose that N, S\, . . . ,Sk £ N w;f/z 



N 



> Si > 



ln(N/ Sl ) 



In 



3- 



rn(N/ Sl ) 



+ In In 



ln(N/si) 
In In (ATM) 



Inln(N/si) \ lnln(N/si) 

and Si, . . . , Sk containing no prime factors less than S\. Then, there exists a valid m X N measurement matrix, Na,s v 
with a number of rows 

\2 



m < 



3- 



In (N/ Sl ) 
lnln(N/si) 



+ 1 -In 



3- 



ln(N/si) 
lnln(NM) 



+ 1+1. 



The corresponding value of A is [3 • ln(N/si)/ lnln(N/si)~|. 
Proof: See Appendix iDl □ 

The (m • m)xN row tensor product matrix, "Ra.k — M Su k®Nx iS i< has several useful properties. First, the fact that the 
first row of A/^s, is the all-ones vector means that 7?a,k w iH contain a copy of every row of Ai Sl ,K- Second, all Ka,k rows 
that are not copies of M Si ,k rows will have the form r^jj, = m0( j Sj .©f,y, mo d f. for some i e [1, A] n N, /' e [1, X] n N, 
and h e [0, f; ■ s,) n N. That is, the Chinese Remainder Theorem tells us that each such Ha.k row will have its n th entry 
given by 



(27) 



/ i 1 ifn = /i mod f;- s; 

(''i / ; / ft)n = 0\[n- h) mod • s/J - 



otherwise 



The end result is that "Ra,k maintains a rigid number theoretic structure. The following Lemma summarizes the most 
important properties of Ha.k = Ms u k ® Na, Si - 

Lemma 5. Let k, e~ l , si, A, n 6 [2, N) n N, £ e C N , anc/ K = 4 ■ (fc/e) I log Sj Nj + 1. Then, more than j of the K 



entries of M s ,,K,n ' x will estimate x n to within 5 



1 precision. Furthermore, if Ty ,-nmods ■, G !0, 1} is a row 



ofM Sl ,K,n associated with one of these more than § entries then it will have all of the following properties: 



(1) 

(2) 

(3) 



?f,nmodSj> ' X %n\ ^ <5» 

yj',n modSj' ® ^i,n mod tA " X — Xn 



= r, 



i f f ,n mod tj-s. 



• x — x n \ < 6 for all i e [1, A] D N, ara/ 



mod Sir © » iJ 



;,)■£ = mod f .. s -x < 5 /or a/Z i £ [1, A] Pi N and /i e [0, £;) D (N - \n mod U\). 



Proof: See Appendix|E| □ 



Suppose / : [0, 2n\ — > (D is a complex valued function with f e I 1 . It is not difficult to see that < R\,kJ can be 
approximated using Algorithm [T] from Section [3~2l since maintains the required number theoretic structure. We 
will simply perform FFTs on aiTays of function samples with sizes given by all possible £; • s; value products. The total 
number of function samples taken will be at most m ■ rh — (A • K + K — 1). For s; and f, values chosen as per Theorem|5] 

and Lemma|4] respectively, the runtime required by Algorithm[T]to approximate 7?a,k/ will be 

( A K-l 



O 



( A K 



o 



i=\ j=0 



•1+1 



(28) 



O 



= O 



(see [15]) 



In p A 

l n 2 N .l n 2(HsN). ln 2^y 



e 2 -]B 2 (|)-lnln(f) J 

The last equality follows from Equation Qj] and the Prime Number Theorem. Finally, it is not difficult to see that the 

precision guarantees of Lemma[3]will still hold for an AlgorithmQ] approximation to Ha.k/- 

Perhaps most importantly, the number theoretic structure of Ka,k also allows us to use methods analogous to those 
outlined in Sections 1 . 1 and 5 of |[T4l to quickly identify frequencies with large magnitude Fourier coefficients in /. 
Suppose that |^,| is large relative to ||/|| (e.g., more than one tenth as large). In this case Lemma[5]above tells us that 

fa, will also have a magnitude nearly as large as that of most entries of M S1i k,(oJ- Let Tj^ mods.- be the row of M Si ,k,w 
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Algorithm 3 Fourier Approximate 2 



l: Input: k, N, e 1 e N - {1}, Function /, An (m • m) X N measurement matrix 7?a,k with K = 4 • (k/e) |log Si Nj + 1 

2: Output: xs, an approximation to f k ° pt 
3: Initialize S <— 0, x <— On 

4: Qa,k^A <- AlgorithmQI/, fc, X, N, sy and f; values for 7? A ,£) 

~ - -» ' 

5: S^WA <— The m entries of Qa^A that approximate M Si ,Kf 

Identification of Frequencies with Large Fourier Coefficients 

6: for j from 1 to K do 

for ft from to sy - 1 do 



for j from 1 to A do 
frmin <- argmin;, e [ , fi) 



aj,h,i <-\h + Vin • sy) mod f; 
end for 

Reconstruct coij, using that co^ = h mod sy, GJy,/, = ay^i mod t\, . . ., com = aj,h,A m °d 
end for 
end for 



Fourier Coefficient Estimation 
t% vaiue reconstructed > 4 



15: for each con, value reconstructed > $ times do 



16: Re [x^} <- median of multiset j]R<e {({?a,k^,^A).J | 1 < j < K • (A + 1)J 
17: lira «- median of multiset jlin {(^K^^A). j 1 1 < j < K • (A + 1)1 



end for 

Sort nonzero x entries by magnitude so that \x Wl \ > \x C02 \ > \x m \ > ... 
S <- [co\, co 2 , C0 2 k\ 
Output xs 



associated with one of these AU^K&f entries dominated by f a . By its construction we know that ??a,K will not on ly 



contain rj A , mo ds,> but also the related rows fi,^ mod h-s,. 



r a, j A , mod t, v s r Furthermore, all A + 1 entries of Hx^mf 



associated with these rows will also be dominated by f a (see Lemma 0. On the other hand, for each i e [1, A] D N 
the |7?a,k,£u/ J entries will all be significantly smaller than f a , in magnitude. Hence, by comparing the relative 

magnitudes of the entries in (rj jCO mo( j s . © A/a /Si ) f we can discern co mod sy, co mod f i • sy, . . . , co mod t,\ • sy. The end 

-* 

result is that co can be recovered by inspecting 7?a,k,£u/- See lfl4l for a detailed discussion of a similar recovery 
procedure. Utilizing these ideas we obtain Algorithmic] 

Note that Algorithms [2] and [3] are quite similar. The only significant difference between them is that Algorithm [2] 
estimates Fourier coefficients for all frequencies in the bandwidth specified by N whereas Algorithm[3]restricts itself 
to estimating the Fourier coefficients for only a small number of frequencies it identifies as significant. Given these 
similarities it should not be surprising that demonstrating the correctness of Algorithm[3]depends primarily on showing 
that it can coiTectly identify all frequencies with coefficients that are sufficiently large in magnitude. This is established 
in Lemma|6]below. 



Lemma 6. Suppose that co 6 (— , fl TL is such that 



\fJ > 4- 



£ ■ 



J J(k/e) 
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Then, lines 6 through 14 of Algorithm\3\will reconstruct a) more than If times. 



Proof: 



Suppose that co e (- f^l , I ^ I] n % has |/~,| > 45 where 



/-/; 



opt 

(k/e) 



Lemma [5] and Lemma [3] guarantee that |(£ S1/ k /W \IM^, - f a 
f e [1,K\ n N is one of these more than f indexes, then property (2) of Lemma [5] together with the preceding 



< 6 for more than If entry indexes Furthermore, if 



discussion of Lemma [3] also ensures that 

itj'jG) mod fj- -s^-. 

Thus, if b e [0, f;) n N in hne 9 of Algorithm^ satisfies 



< 5 for all i e [1, A] n N. Fix i e [1, A] n N. 



(29) 

we can see that 



= ((a; mod Sy) + b ■ sA mod f; ■ Sj/ 



V mod ... V 'i 



: ,j',(ai mod y)+fty 



'f,a mod Sj-/ 



j',to mod f ( -: 



<2S. 



Otherwise, if b e [0, f,) n N does not satisfy Equation[29l property (3) from Lemma[5]in combination with Lemma[3] 
ensures that 



25 < \fj - 



f ,cd mod S:, 



',j' r ^ci> mod sy^+b-s 



0* 



mod s./ 



Therefore, the & = £> m j n identified in line 9 of Algorithm[3]will be guaranteed to satisfy Equation[29]for all i e [1, A]nN. 

Once we have identified co mod f, ■ Sf in this fashion we can find co mod £, in line 10 of Algorithm|3]by computing 
(co mod i; ■ Sy ) mod f,. Finally, by construction, the set {t\, . . . , t\, Sj>] both has a collective product larger than N, 
and is pairwise relatively prime. Therefore, the Chinese Remainder Theorem guarantees that line 12 of Algorithm^ 
will indeed correctly reconstruct co when j = f and h = co mod sii . □ 

With Lemma [6] in hand we are now prepared to prove that Algorithm [3] can indeed recover near-optimal sparse 
Fourier representations in sublinear-time. We begin by using Lemma|6]to show that all sufficiently energetic frequen- 
cies are guaranteed to be identified. Hence, the only way Algorithm|3]will not include an optimal Fourier representation 
frequency in its output is if the frequency is either (i) insufficiently energetic to be identified, or (it) gets identified, 
but is then mistakenly estimated to have a smaller magnitude Fourier coefficient than many other somewhat energetic 
frequencies. In the case of (i) it is forgivable to exclude the frequency given that it must have a Fourier coefficient with 
a relatively small magnitude. In the case of (it) we make up for the exclusion of a truly energetic frequency term by 
including many other less significant, but still fairly energetic, frequency terms in its place. Carefully combining these 
ideas leads us to the error, sampling, and runtime bounds for Algorithm[3] stated in Theorem[7]below. 

Theorem 7. Suppose f : [0,2n] -> C has f £ f. Let~N,k,e~ Y eN-jl) withN > (k/e) > 2. Then, Algorithm\3\will 
output an x$ £ <C N satisfying 



He- 



rn 



f-xs 



opt 



f~f< 



opt 

(k/e) 



■22Vfc- 



Under the conditions of Lemma^\ f will be evaluated at less than 

|2 

" ' ' " In (eN/k) 



19.52 ■ ^!^] . In f ^ ' k ^ ^ 1 



3- 



lnln(eN/fc) 



+ 1 -In 



\n(eN/k) 
hx1n(eN/k) 




/ ^-log 2 N-log( )log 2 ( ^ ) \ 

points in [0, 2n]. 77ze runtime of lines 6 through 21, as well as the number of f -evaluations, is O y — i g 2 ( * ) g ^ log iog( ) /' 



Proof: See Appendix|F| □ 

The overall runtime behavior of Algorithm [3] is determined by both the runtime of Algorithm Q] as called in line 
4 of Algorithm [3] and the runtime stated in Theorem [7] The overall runtime complexity of Algorithm [3] is therefore 
given in Equation [28] As in Section [4] above, both this runtime and the number of function evaluations required for 
approximate Fourier reconstruction can be decreased by reducing the number of measurement matrix rows (i.e., Ka,k 
rows) used to estimate each Fourier coefficient. This effectively replaces K in Algorithm[3]with a significantly smaller 
value (e.g., the value I from Corollary |2). However, in exchange for the resulting runtime improvements we will once 
again have to sacrifice approximation guarantees for a small probability of outputting a highly inaccurate answer. 

Following the strategy above, we will improve the performance of Algorithm[3]by modifying its utilized measure- 
ment matrix as follows: Instead of using a M Si ,k matrix as constructed in Section[3]to build Ka,k = A1 Sl ,x®-^A,si> we 
will instead use a randomly constructed yVlg matrix as described in Section [3~Tl to build H A g = Atg © A/a, S] ■ Corollary [2] 
combined with the proof of Lemma [5] ensures that such a randomly constructed measurement matrix, H A g, will be 
likely to have all the properties of Ka,k matrices that Algorithm [3] needs to function coiTectly. Hence, with high prob- 
ability we will receive output from Algorithm [3] with the same approximation eiTor bounds as derived for Theorem[7j 
Formalizing these ideas we obtain the following Corollary proven in AppendixlGl 



Corollary 4. Suppose f : [0,2n] -> C has f e I 1 . Let a e [2/3, 1) and N,k,e~ l eN-(l) with N > (k/e) > 2. 
Algorithm\3\may be executed using a random matrix, 7? A j = M§ ® Na^, in place of the deterministic matrix, 
ft\,K = M Si ,k ® Na,s-i, considered above. In this case Algorithm\3\will produce an output vector, Xs e C N , that 
satisfies Equation\30\with probability at least a. Both the runtime of lines 6 through 21 and the number of points in 
[0, 2n] at which f will be evaluated are 

Explicit upper bounds on the number of point evaluations are easily obtained from the proof below. 
Proof: See Appendix[G] □ 

When executed with a random matrix, K A g, as input the overall runtime complexity of Algorithm [3] will be deter- 
mined by both the runtime stated in Corollary[4]and the runtime of AlgorithmQ] Suppose S is a subset of O (log ( jr^)) 
Si values defined as per Equations [9]— [TT] Then, Algorithm[T]will have a runtime complexity of 



O 



'=1 s.-eS 



= O 



log Pa 



(see Q3), Corollary© 



( 



(31) 



= O 



^log (fc/e) N.log 2 (™).ln 2 (^) / N \ 

log(— ) 



(see Equation[T3l Lemma[4|. 



e-lnln(^) 

Thus, if we are willing to fail with probability at most 1 - a = 1/N 0< - 1 \ then Algorithm [3] executed with a random 
input matrix will have a total runtime complexity of O ((k/e) ■ log 4 N ■ log( ° gN ))- 

6. Higher Dimensional Fourier Transforms 

In this section we will consider methods for approximating the Fourier transform of a periodic function of D vari- 
ables, / : [0, 2n] D — > (D. To begin, we will demonstrate how to approximate the Fourier transform of / by calculating 
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the discrete Fourier transform of a related one-dimensional function, _f new : [0, 2n] — » <D. This dimensionality reduc- 
tion technique for multidimensional Fourier transforms will ultimately enable us to quickly approximate f by applying 
the methods of Section[5]to f's related one dimensional function f new . The end result will be a set of algorithms for 
approximating / whose runtimes scale polynomially in the input dimension D. 

Suppose that the Fourier transform of / above, / : Z D — » (D, is near zero for all integer points outside of the 
D-dimensional cubic lattice ([-M/2,M/2] n Z) D . In order to help us approximately recover / we will choose D 
pairwise relatively prime integers, P\, . . . , Pd £ N, with the property that > M ■ D for all d e [1, D] n N. Set 
N = IlrfLi frf- Furthermore, let y -1 modp e [0,p) fl N denote the multiplicative inverse of (y mod p) e Z p when it 
exists. Note that y -1 modp will exist whenever y is relatively prime to p. 

We may now define the function / new : [0, 2n] — » (D to be 



(32) 



/newW — / 



N N 



Considering the Fourier transform of f new we can see that 



fnew(co) 



2hJ 



2n 

(wi,...,w D )e.TP 



N 



f{co\,...,oju) 



pin 

Jo ' 



dx 



(33) 



E 



f(coi,..., cod)- 



{fi>l,,..fi> D )£% D s.t. £u=£y =1 f- d OJ d 



Recall that we are primarily interested in capturing the information about / inside ([-M/2,M/2] n Z) . Looking at 
the d£Z for which _/n ew can impacted by {co\, . . ., cod) e ([~M/2,M/2] n Z) D we can see that 



rf=i 



2P7 < 2-i 2D " 2~' 

rf=i 8 rf=i 

Hence, we may consider / new to have an effective bandwidth of N. 

More importantly, there is abijective correspondence between the integer lattice points, {co\, . . . ,cod) € ([— M/2,M/2] D Z) D , 
and their representative frequency, co € [— N/2, N/2] fl Z, in / ne w Define the function 



Pi Pa 
" 2 ' 2 



n N 



N N 
T' "2 



to be 



8{xi xd) H§(^ 



•Xrf 



mod N. 



The Chinese Remainder Theorem tells us that g is a well-defined bijection. Furthermore, it is not difficult to see that 

-i ( / _ \— 1 mod Pi / „ \-l modPn \ 

g~\x) = ix ■ (N/Pi) mod Pi, . . . , x ■ (N/P D ) mod P D j . 

Thus, we have ft^{co) ~ / (g -1 ^))- 

We now have a three-step algorithm for finding a sparse Fourier approximation for any function f : [0, 2n] D — » 

(D. All we must do is: (z) Implicitly create / new as per Equation [32] (z'z) Use the techniques from Section [5] to 

-* -* 

approximate f new , and then (z'z'z) Use the approximation for / new to approximate / via Equation [33] The following 
theorem summarizes some of the results one can achieve by utilizing this approach. 

Theorem 8. Suppose f : [0, 2n] D — > (D is bandlimited so that f(co\, cod) = if{co\, cod) € ([- y, y ] n z)°. 
Define N as above and suppose that N, k, e -1 e N — {1} wz'f/z N > (k/e) 2 > 4. 77zen, Algorithm\3\combined with the 
bijective mapping, g, above will output an x$ e (D N satisfying 



(34) 



(*s o , 



22e- 



V ./(fc/e) 



Both the runtime of lines 6 - 21, and the number of points in [0, 2n\ D at which f will be evaluated, will be 

V-D 4 -log 4 (MD)' 



l 

O 



If succeeding with probability a e [2/3, 1) is sufficient, and N > (k/e) > 2, Algorithm\3\may instead be executed 
using a random matrix H A g. In this case Algorithm\3\will produce an output vector, x$ e <C N , that satisfies Equation \34\ 
with probability at least a. Both the runtime of lines 6 - 21, and the number of points in [0, 27i] D at which f will be 
evaluated, will be 



O I . log 3 (MD) ■ log 



Finally, if an exponential runtime o/Q|(DM) d | is acceptable, we note that both Theorem\6\and Corollary\3\can 
also be adapted to recovering f : [0, 2n] D — > <D by substituting N with & |(MD) D j everywhere in their statements. 

Proof: See Appendix [Hi □ 

Note that traditional FFT algorithms (e.g., E|23][E|) require q(m d ) -time to calculate the Fourier transform of a 
bandlimited function / : [0, 2n] D — » (D. In contrast, Theorem[8] allows us to approximate / using exponentially fewer 
(in D) operations. Hence, if f has a relatively sparse Fourier representation (e.g., if f is dominated by k - M ^ 
energetic frequencies), Theorem [8] allows / to be accurately approximated much more quickly than possible using 
standard techniques. 

7. Conclusion 

In conclusion, it is worth pointing out that the methods developed in this paper for approximating the Fourier 
transforms of periodic functions are also applicable to the approximation of functions which have accurate sparse 
representations in related bases. For example, all the theorems proven herein will also apply to functions with sparsely 
representable Cosine or Chebyshev expansions (see (3) for an in depth discussion of the relationships between these 
series expansions). Hence, we have also implicitly constructed sublinear-time algorithms for approximating these 
related transforms. 
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for all n > 599. Using this result (in combination with numerical tests for n < 600) we obtain the following bounds 
for q + K and q (see EquationfTTI). 
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Appendix A. Proof of Theorem[5] 
Let n(n) be the number of primes no greater than n. In iflOl it is shown that 




(35) q + K< n(k/e) + K+l < 



fc K*/e) N J 



1 1.2762 2 • e 



\ 



£ 



Mk/e) ■ [log m N\ ln 2 (k/e) ■ [log m N\ k ■ [log m n\ 



and 



(36) 




Continuing, we can bound m if our s, values are chosen to be primes as above by noting that 



9-1 <?-! 



£p;>£/-ln(;) (seeHOl) 

;=1 7=1 



(37) 
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and 



q+K-l 

£ Pj < 10 + J ■ MPj) (see HO)) 



(38) 



(39) 



q+K-l 

L 

< 10 + ln(p, +K ) 



(q+K-l \ 



E 



< ^ + K ^ q + K) . In + K) ■ (ln(q + K) + In ln(q + K)) ) (see ED) 



< -(q + K) 2 -ln(q + K). 



Using Equation[35]together with Equation[39] finishes the proof. More specifically, we have that 



c + 



1.2762 



2-e 



■ [log m N\ W(k/e) ■ [log m N\ k ■ [log m N\ J 
Therefore, we can see that 

q+K-l 



1 1.2762 1\ 
^ IC+ ^4 + W + 2] £U + J ' 8t)) - 



m < Yj Pi- 

as we wished to prove. 



3(c + 1.89) 2 -fc 2 [log (W Nj f( C + 1.89)-A:[log (fc/£) Nj , | 
■ In 



4-e 2 



Appendix B. Proof of Corollary[2] 

We prove the result via an argument similar to the one used to prove Lemma 2 in |[T4| . Fix n e S. We will 
select our multiset of s; values, S, by independently choosing I elements of {si, S2, . . . , Sk] uniformly at random with 
replacement. The first element chosen for S will be denoted Sj\, the second Sj 2 , and so forth. Let Q" t be the random 
variable indicating whether the Sj h value selected for S satisfies 



(40) 

Therefore, 



e ■ 



-> -opt 
X ~ X (k/e) 



1 if Sj h satisfies Property l40l 
otherwise 



TheoremHtells us that F [QJ| = l] > f . Furthermore, p. = E [eJ, = i QJj] > y • 
Using the Chernoff bound (see [18]) we get that the probability of 

h=l 

is less than e~is < < lip. Since / > 21 we can see that ]£fc=i QjJ will be less than with probability less than 
fe?. Hence, Property l40l will be satisfied by more than 1/2 of the € S with high probability. Applying the union 
bound shows that the majority of the entries in S will indeed satisfy Property |40]for all n e S with probability at least 
a. The result follows. 



Appendix C. Proof of Corollary [3] 
Apply Corollary |2] with c = U, x = f, and S = (- [f 1,1 f I] n TL to obtain S, a multiset of [21 ■ Wj^)] Sj 
values. With probability at least a more than half (with multiplicity) of the entries of Aig, cof will estimate f cu to 



within (e/fc) • 



f-ft 



opt 

We) 



precision for all 00 e (- [f ] , |_f J] n TL. Furthermore, Mgf can still be approximately 

computed using Algorithm Q] if only the unique s ; values in S are given as the relatively prime inputs. In this case 
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Lemma[3]will also still hold. Taken all together we can see that with probability at least a all N x a , values produced 
by lines 6 and 7 of Algorithm|2]will have 



■fJ< V2- 



e ■ 



f-fr 



(k/e) 



f-f 



The Equationl2Tlerror bound now follows from the proof of Theorem|6] 

To upper bound the number of required function evaluations we will bound the number of rows for a particular Aig 
matrix constructed with primes as per Section [3] In particular, we will assume that S contains at most |~21 ■ lnf^^^j 

individual s; values defined as in Equations|9]-[II]with K = 14 ■ (k/e) I log Si A/j + 1. In this case Equation |35l together 
with results from 1 10 1 tell us that Sk is at most 



(41) 



15.89- 



In 



+ In In 



The stated upper bound on the number of required function evaluations follows. The stated runtime follows from the 
fact that each line 6 and 7 median now only involves O (log ( jr^fj values. 

Appendix D. Proof of Lemma@] 

We can always set t\ = p\ < ■ ■ ■ < t\ = p y \- In this case we require that p\ < s\ < the smallest prime factor of 
S\, . . . , Sk- Secondly, we require that Ya=\ ^ n Pi ^ ^-(st) • Using results from ||25ll it is easily verified that 

,i 

J^lnp, > A-(lnA-l) 

i=i 

for all A e N + . Setting A = ["3 ln^ j / In In in the equation above we can see that 

^lnp,>lng).3 



i=l 



1 - 



In In In (I)" 



as long as N/s\ > 3. Hence, if we choose our f; values to be the first A primes the second requirement will be satisfied. 
Results from 1 10] then tell us that 



t,\ -pA < Pf3ta(A//si)/In]n(Wsi)l 



3- 



ln(N/si) 
lnln(N/si) 



In 



3- 



ln(N/si) 
In In (JVM) 



+ In In 



Inln(N/si) 



<Si. 



Therefore, the prime f; values we have selected will also satisfy the first requirement above. To bound the smallest 
possible number of rows we note that 



31n| 



m < 1 + 



E 



!=1 



3- 



ln(NM) 
lnln(N/si) 



+ 1 -In 



3- 



ln(N/si) 
lnln(N/si) 



+ 1+1 (see Equation 



The stated result follows. 



Appendix E. Proof of Lemma \5\ 
In addition to iwe will also consider y 6 <C N defined by 

y n , = \ Xn ,\ for all ri 6 [0,N) n N. 
Note that y and x will not only share the same optimal (fc/e)-term support subset, S° p ' C [0, N) n N, but will also 



have 



-> -opt 
X ~ X (k/e) 



-> -apt 

y-y { kM 



(k/e) 

Theorem|4]tells us that more than f entries of M si xn ' V will estimate y n to within 
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II -> HOpt || || -t -Opt || 

! ? (W||i _ s _ 1 (Will 



precision. Let {M Sll K,n ' P)i> f° r /' e [1, -JC] n N be one of these ^ entries. The proof of 



Lemma|2]tells us that the row associated with this entry also has the property that 



{M Sl , K ,n ■ - X„ 



E 



< 5. 



n'=n mods./, «'£S,, , n'^n 

J (K/ E) 



Therefore, we have established property (1). 

Considering property (2) for this f we can see that for all i e [1, A] D N we will have 



'i,f,n mod tj-Sj' 'X %n\ — 



E 



H'=n mod tj-Sj>, n'gS°^ £ y n'+n 



E 

Si/, n' 

E 



!/»' 



«'=n mod trSif, n'^S^F. n'^n 

1 ; (£/f)' 



< 6. 



n'=n mods,', n'£S°, p ,\, n'+n 



Finally, to verify property (3) we can bound 
(N - {n mod t,}) by 



(rf,n mod S/ © • * from above for all i e [1, A] n N and h e [0, f,) 



n 



E 



n'=n mod S/ ; n'=/? mod t u n'^SfF, \ 



E 



< 5. 



n'=n mods,/, n r $.$.[, ,, n'±n 

1 [Kj £) 



Hence, we can see that all three properties will indeed hold for at least ~ rows of Ai Si/ K,n- 

Appendix F. Proof of Theorem[7] 

Let 5 be defined as 

£ ■ 



<5 = 



r_ f opt 

J J(k/e) 



f-f 



Furthermore, suppose j e [1,K] n N and h e [0, sf) correspond to an cojj, e (— [~ y~| , Pi 7L which is reconstructed 
more than f times by line 12 of Algorithm [3] As a consequence of Lemmas [3] and [5] we can see than more than 



half of the entries of Qx^^A produced in line 4 will satisfy 
produced by lines 16 and 17 will have 



< 5. Therefore, the x (I}jh value 



(42) 



tun-faJ < V2-S. 



Since Equation|42]will hold for all a> 6 f — |"y 1 , 1 y jj D Z reconstructed more than | times, we can begin to bound 
the approximation error by 

■2Vfc-6 



(43) 



-> 

f-x s 
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-* -* 

f-fs 


+ 
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fs-xs 


< 
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-* -* 

f-fs 



f-f] 



opt 



+ \M 2 - E \ff +2 ^- d - 



veST-S 



In order to make additional progress on Equation [43] we must now consider the possible magnitudes of / entries at 

indices in S - S opt and S° pt - S. 

k k 

Suppose co e S° pt - S + 0. In this case either (z) \fo\ ^ 46, or (zz) \ff\ > 45 in which case Lemma [6] guarantees 

that cv will be identified by lines 6 through 14 of Algorithm^ Once identified, and) e S^ pt will always be placed in S 

unless at least k+1 other distinct identified elements, a> £ S? p , have the property that \x$\ > \x a \. Thus, if (zz) occurs 
then 

(44) V2-6 > \fl\ + V2-6 > |/;|- V2-6 > [4| - V2-6 
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will hold for all & e S - S^ pt . The end result is that if a> e S^ pt - S then either \ f w \ < 45, or else f a is roughly the same 
magnitude as f a , k (up to a 0(<5) tolerance). Furthermore, because line 20 chooses 2k elements for S whenever possible, 
we can see that S - S° pt must contain at least 



\M>M\)-s 

-.opt 



elements, a>, all of which satisfy Equation [44] for every oj € IS^ fl |a) > 46|j - S. We are now ready to give 
Equation|43l further consideration. 

If S£ pt - S = we are finished. Otherwise, if S^ pt - S + 0, we can bound the squared Z 2 -norm of 4s° pl from below 

by 

E \faf > 2 ■ (s° pt n {<y I |/;| > 4S}) - S ■ (|/I, t | - 2 V2 ■ bf = A. 



Furthermore, we can upper bound the squared Z 2 -norm of / s °pt_ s by 

k 

(s° pt n {<y [ |/; | > 4s}) - S ■ + 2 V2 ■ df + (s° pt n [ [&| < 45}) - S 



• i66 2 > E |/; 

a,EST-S 



LetB = 



S° pt n{o;| |/ (U |>46[)- 



I j - S ■ (|4| + 2 ^ ' ^) ■ We w ^ now concentrate on bounding 

c= E \M 2 - E Itf- 



If A > B then C < 16k ■ 5 2 . Otherwise, if A < B then 

|4| 2 -12V26-|4| + 86 2 <0 

which can only happen if |4| e ((6 V2 - 8) • <5, (6 V2 + 8) • s). Hence, A < B implies that C < k ■ (8 V2 + 8) 2 • <5 2 . 
Finishing our error analysis, we can see that in the worst possible case Equation|43]will remain bounded by 



-> 
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f-x s 




f-fr 


+ k 

2 



+ /c-(8V2 + 8) • <5 2 + 2 Vfc -6 < 



f-f? 



+ 22Vfc-<5. 



The error bound stated in Equation|30]follows. The upper bound on the number of point evaluations of f follows from 
an application of Lemma|4]and Theorem[5]with c = 4. 

We will begin bounding the runtime of Algorithm [3] by bounding the runtime of lines 15 through 21. Line 12 of 

(iP-loe 2 N loz( k ' lnN ^ 
B( W ^ SV c ) 

will be executed O ^ og w c ) ^ og ( c ) j times apiece. Each such median operation can be accomplished in 0(K ■ A) 



j times (see Equation \\4\. Therefore, lines 16 and 17 



time using a median-of-medians algorithm (e.g., see [9]). Therefore, the total runtime of lines 15 through 21 will 

( kl - lo slie) N-log(^)-log(eN/it) \ 

21 — i — , rrm ■ Turning our attention to lines 6 through 14, we note that their runtime will be 



be O 



dominated by the O 

14 will be O 
follows. 



ic 2 -logf t/e) N.log(^).log( e N/)c) 
e 2 -loglog(eN/Jr) 
F-log 2 t/c) N.log(^).log 2 ( e N/)c) 



£ 2 -loglog(dV/S:) 



j j , x , j executions of line 9. Therefore, the total runtime of lines 6 through 

j (see [15] and Lemma|4]i. The stated overall runtime of lines 6 through 21 



Appendix G. Proof of Corollary |4] 



Define \f\ e R N by 



l/l) = 14 for all e 

25 



n tl. 



Clearly / and |/| will both have the same optimal (fc/e)-term support subset, S?^ . C [0, N) n N. Similarly, it is easy to 

Apply Corollary|2]with c = 14, f = and S = (- [f ] , [f J] n Z to obtain 



see that 



*>pt 

We) 



l/l- l/l 



S, a multiset of [^21 • ln^j^^l Sj values. With probability at least a more than half (with multiplicity) of the entries of 



/_ /°Pj 

7 J (k/e) 



precision for all we S. 



Mg, co ■ |/| will estimate the co th entry of |/| to within (e/k) ■ 

Given the last paragraph, it is not difficult to see that with probability at least a a result analogous to that of Lemma[5] 

-* 

will hold for H A $ ■ f. That is, with probability at least a the following will hold for all aeS: The majority (when 

-* 

counted with multiplicity) of Wig, to rows, re {0, 1} N , will have if® s) • / » f a , for a given row, s) of Na,si if and only 

-* 

if s*is also a row of Na,si,cl>- Furthermore, "R A § ■ f can still be approximately computed using Algorithm [TJ if only the 
unique s; values in S are given as relatively prime sy-inputs. In this case a result analogous to Lemma[3]will also still 
hold since we will merely be computing a subset of the previously calculated vector entries. Finally, by inspecting the 
proof of Lemma [6] we can see that an almost identical result (with K replaced by the I value from Corollary [2]i will 

hold any time ?? A g • / satisfies the aforementioned variants of both Lemmas [5] and [3] 

Taken all together, we can see that with probability at least a both of the following statements will be true: First, 
all at most N x ajjli values ever produced by lines 16 and 17 of Algorithm[3]will have 



Second, a variant of Lemma|6]will ensure that all co 6 S with 



f-ft 



opt 

(k/e) 



f-f 



\fl\ > 4- 



€ ■ 



opt 

(k/e) 



f-f 



are reconstructed by lines 6 through 14 of Algorithm[3]more than ^ — times. The Equation[30lerror bound now 
follows from the proof of Theorem|7] 

We upper bound the number of required function evaluations by bounding the number of rows for a particular 
matrix constructed with |21 -ln^^^ randomly chosen Sj values defined as in Equations l9l - [TTI (with K = 

14 • (k/e) [log Si N\ + 1 V In this case Equation[35]together with results from IfTOl tell us that Sk is itself bounded above 
by Equation [41] The final upper bound on the number of point evaluations of f then follows from an application of 
Lemma[4] Note that the product of the Lemma[4]row bound with |^21 ■ In ( jz^YI and Equation [TTJprovides a concrete 
upper bound for the number of point evaluations of /. 

We will begin bounding the runtime of Algorithm [3] by bounding the runtime of lines 15 through 21. Line 12 



of Algorithm [3] will be executed a total of O 
16 and 17 will be executed O 



*-log(&>Iog We) AMo{ 



/ HogN ' 



times (see Equation |4TT). Therefore, lines 



Mog (t/c) N.log(^ 



j times apiece. Each such median operation can be accomplished in 
O^log^j^^ • /\) time using a median-of-medians algorithm (e.g., see J9]). Therefore, the total runtime of lines 15 

Wog( £)-log (l/e) N-log( ^).log( £ NA) \ 



through 21 will be O 



e-log log(eN/k) 



Turning our attention to lines 6 through 14, we note that 



their runtime will be dominated by the O 



Mog( T ^)-log (t/c) N.log(^).log( g NA) 



runtime of lines 6 through 14 will be O 
of lines 6 through 21 follows. 



e-loglog(eN//c) 

;c.log(^).log (t/c) N.log(^).log 2 ( g NA) 
e-loglog(eN/fc) 
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executions of line 9. Therefore, the total 
(see Lemma[4|. The stated overall runtime 



Appendix H. Proof of Theorem[8] 
Suppose we want to resolve at least M frequencies in each of D dimensions (i.e., we want to approximate the 
D-dimensional array / e <C M °). We begin by choosing the smallest DeN such that 



D-D+l 



Yl Pj > (MDf. 



The first paragraph of Appendix iDlreveals that 



(45) D < 3- 



D • ln(MD) 



+ D 



D • ln(MD) 



ln(D • ln(MD)) ln(D • ln(MD)) 

Furthermore, we can see from ifTOl that 



T + ln(MD) j = °l 



D ■ log(MD) \ 
log(D-log(MD))y 



C ^ D 

(46) In J^py < D-ln(MD)+ V lnpy < D-(ln(MD) + lnp ) < D • (ln(MD) + In (£> ■ (In D + In In £>)) 

, 7=1 J ;=D-D+1 

for D > 2 and (MD) D > 2, 310. Thus, log (n° i Py) is generally 0(D • log(M • D)). We will use these first D primes 
to help define our new one-dimensional function / new (see Equation[32l above) . 
Set Do = and recursively define to be such that 

Ai-l D d 
l\ Pj < MD < J] py 
y=D d _!+i ;=d,i-i+i 
for all 1 < d < D. We then define the D pairwise relatively prime values required for approximating each Fourier 
coefficient, /(g -1 ^)) e (D, via Equation [33] to be 

^= n 

/=D,,-i+l 

for 1 < d < D. Set N = Yld=\ ^ llyli Py- The stated runtime and sampling bounds follow. 

We are now in the position to apply any of Theorem [6] Corollary [3] Theorem [7] or Corollary 0] to approximate 

/new G <D N . Upon applying any of these four results to / new we will obtain (either deterministically, or randomly with 
high probability) a 2/c-sparse xs e <C N satisfying 



/new %S 



/new / new j. 



22e- 



/new / 



opt 

new (k/e) 



22Vfc- 



/new /ru 



Recall that we have only guaranteed that /'s Fourier coefficients for ([-M/2, M/2] n Z) map into f aew 's Fourier co- 
efficients for [-N/2, N/2] n Z (although many others will as well). Thus, for simplicity, we assumed that / is bandlim- 

ited when translating these error bounds back into terms of /. Given this assumption we have / new - / new = 0. 
The stated error bound now follows from the fact that g is a bijection. 
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